This report was generated using iSHARC. For additional downstream analyses and relevant tools, please refer to COBE.

QC metrics and filtering

Selected percentiles for QC metrics

  • nCount_RNA: The number of transcripts detected per cell.
  • pct_MT: The percentage of reads originating from mitochondrial genes.
  • nCount_ATAC: The number of unique nuclear fragments.
  • TSS_Enrichment: The ratio of fragments centered at TSS to those in TSS-flanking regions.
  • Nucleosome_Signal: The approximate ratio of mononucleosomal to nucleosome-free fragments.

Number of cells jointly called by Cell Ranger ARC: 14645

Percentile  nCount_RNA     pct_MT  nCount_ATAC  TSS_Enrichment  Nucleosome_Signal
0%              501.00  0.0000000      1001.00        1.500772          0.3461538
2.5%            571.00  0.0000000      1091.00        2.220207          0.4774588
25%            1044.00  0.0375270      1957.25        2.781574          0.5805203
50%            1648.00  0.1251096      3335.50        3.138325          0.6473854
75%            2712.75  0.2810257      6004.00        3.576497          0.7283766
97.5%          5190.00  0.7372309     11450.15        4.775505          0.9187123
100%           5759.00  0.8659508     12481.00        7.669254          1.0233384
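
As a rough illustration of how a percentile table like the one above can be computed, the sketch below implements linear-interpolation quantiles (the default "type 7" rule of R's quantile()). Whether iSHARC uses this exact rule is an assumption; the helper name quantile_type7 and the toy values are hypothetical and are not taken from this sample.

```python
import math


def quantile_type7(values, p):
    """Linear-interpolation quantile for a probability p in [0, 1].

    Assumption: this mirrors the common "type 7" definition; the report
    does not state which quantile rule was actually used.
    """
    xs = sorted(values)
    if not xs:
        raise ValueError("values must not be empty")
    h = (len(xs) - 1) * p          # fractional index into the sorted data
    lo = math.floor(h)             # index of the lower neighbour
    hi = min(lo + 1, len(xs) - 1)  # index of the upper neighbour (clamped)
    return xs[lo] + (h - lo) * (xs[hi] - xs[lo])


# Hypothetical per-cell counts, for illustration only.
toy_counts = [500, 800, 1200, 1600, 2500, 6000]
for p in (0.0, 0.025, 0.25, 0.5, 0.75, 0.975, 1.0):
    print(f"{p:.1%}", quantile_type7(toy_counts, p))
```

Non-integer entries in a percentile table, such as values ending in .25 or .15, are what interpolation between neighbouring cells produces.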

Violin plots for selected QC metrics

NOTE: if the second-round QC filtering is enabled in the configuration file, the violin plots below display the distribution of QC metrics following the application of the second-round filtering criteria.

Second-round QC filtering assessment

The automatic QC metric cutoffs are determined based on the preset thresholds and the corresponding median ± 3*MAD (Median Absolute Deviation), as outlined below. The table displays the number of cells retained after applying the corresponding QC filters. The subsequent results for the cells in this report depend on whether the second-round QC filtering is enabled in the configuration file.

  • nCount_RNA: [max(nCount_RNA_min, median - 3*MAD), min(nCount_RNA_max, median + 3*MAD)]
  • nCount_ATAC: [max(nCount_ATAC_min, median - 3*MAD), min(nCount_ATAC_max, median + 3*MAD)]
  • pct_MT: < min(pct_MT_max, median + 3*MAD)
  • TSS_Enrichment: > max(TSS_Enrichment_min, median - 3*MAD)
  • Nucleosome_Signal: < min(Nucleosome_Signal_max, median + 3*MAD)

                                1st_joint_filtered  2nd_filtered_by_RNA  2nd_filtered_by_ATAC  2nd_filtered_by_Both
Number of cells                              14645                10312                  8728                  7538
Fraction of 1st filtered cells                1.00                 0.70                  0.60                  0.51
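
The cutoff rules listed above can be sketched as follows. This is a minimal illustration, not the pipeline's implementation: the function names, the preset thresholds (500 and 25000), and the toy counts are hypothetical. It also assumes an unscaled MAD; note that R's mad() multiplies by 1.4826 by default, and the report does not state which variant is used.

```python
from statistics import median


def mad(values, scale=1.0):
    """Median absolute deviation.

    Assumption: unscaled by default. Pass scale=1.4826 to mimic the
    default behaviour of R's mad().
    """
    m = median(values)
    return scale * median(abs(v - m) for v in values)


def two_sided_cutoffs(values, preset_min, preset_max, n_mads=3):
    """[max(preset_min, median - 3*MAD), min(preset_max, median + 3*MAD)]"""
    m, d = median(values), mad(values)
    return max(preset_min, m - n_mads * d), min(preset_max, m + n_mads * d)


def upper_cutoff(values, preset_max, n_mads=3):
    """min(preset_max, median + 3*MAD), e.g. for pct_MT or Nucleosome_Signal."""
    return min(preset_max, median(values) + n_mads * mad(values))


def lower_cutoff(values, preset_min, n_mads=3):
    """max(preset_min, median - 3*MAD), e.g. for TSS_Enrichment."""
    return max(preset_min, median(values) - n_mads * mad(values))


# Hypothetical per-cell RNA counts and hypothetical preset thresholds.
toy_counts = [600, 900, 1200, 1600, 2000, 2600, 9000]
low, high = two_sided_cutoffs(toy_counts, preset_min=500, preset_max=25000)
kept = [c for c in toy_counts if low <= c <= high]
print(low, high, len(kept))
```

With these toy values the median is 1600 and the unscaled MAD is 700, so the interval is [500, 3700]: the preset minimum wins on the lower side, the MAD-based bound wins on the upper side, and the 9000-count cell is removed.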

Cell cycle assessment

  • Exercise caution when analyzing cells undergoing differentiation processes (e.g., hematopoiesis).
  • PCA plots are provided both before and after regressing out the effects of the cell cycle. Only cell cycle-related genes are used for the PCA analysis.

Integration and Clustering

ATAC, RNA and integrated WNN clustering

Modality weights for integrated WNN clusters

Higher ATAC weights indicate that the epigenome has a greater influence on the corresponding cell clusters. Similarly, higher RNA weights suggest that the transcriptome plays a more significant role in the corresponding cell clusters.

Clustering changes across individual and integrated modalities

The interactive Sankey plot below enables the tracking of clustering changes between individual and integrated modalities.

Auto cell annotation

Annotation using publicly available references

The Blueprint/ENCODE reference from the R package celldex is used for automatically annotating each cell cluster using SingleR.

Inferring tumor and normal cells

CopyKAT is used to classify cells into the following categories: aneuploid (tumor cells); diploid (stromal normal cells); not.defined (cells that cannot be predicted by CopyKAT); not.predicted (cells excluded from CopyKAT prediction).

Cluster-specific genes

Top 5 upregulated genes per WNN cluster

The ATAC peaks linked to these top 5 genes can be found in the file lymphoma_14k_top5_DEGs_linked_peaks.csv. This file will be empty if no genes pass the preset cutoffs.

Functional enrichment analysis (GO) for WNN cluster-specific genes

Functional enrichment analysis (KEGG) for WNN cluster-specific genes

Cluster-specific regulatory regions

Top 5 enriched TF motifs per WNN cluster

NOTE: The heatmap is generated from the combined top-enriched TF motifs, which are saved in Seurat_object@assays$ATAC@misc$DARs_motif_hm. Values of -log10(p_adj) greater than 10 are capped at 10. The heatmap will be empty if no motifs are enriched or if none pass the preset cutoffs.
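
The capping of -log10(p_adj) at 10 can be illustrated with a short sketch. The function name capped_neg_log10 is hypothetical, and treating p_adj = 0 as the capped value is an assumption made here to avoid taking the logarithm of zero; the report does not describe how such values are handled.

```python
import math


def capped_neg_log10(p_adj, cap=10.0):
    """Return -log10(p_adj), limited to at most `cap`.

    Assumption: an adjusted p-value of 0 (numerical underflow) is mapped
    to the cap rather than raising an error.
    """
    if p_adj <= 0:
        return cap
    return min(cap, -math.log10(p_adj))


# Hypothetical adjusted p-values for three motifs.
for p in (0.001, 1e-10, 1e-30):
    print(p, capped_neg_log10(p))
```

Capping keeps a few extremely significant motifs from dominating the heatmap's color scale, so differences among moderately enriched motifs remain visible.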

Gene Regulatory Network

Pando is used for GRN analysis:

  • Edge color: Darkgrey (Inhibitory); Orange (Activating)
  • Node color: Lightgrey (DEG); Brown (TF); Brown with Black circle (DEG & TF)
  • Node size: Based on the node's centrality in the graph
  • The plot will be empty if no GRNs are identified with the preset cutoffs.

GRN based on combined WNN cluster-specific genes

GRN based on individual WNN cluster-specific genes

List of main outputs

Note: additional tables and figures are available in the sample directory workdir/individual_samples/lymphoma_14k

Seurat objects at three levels of analysis depth

  • lymphoma_14k_initial_seurat_object.RDS: The initial Seurat object containing RNA and ATAC assays, with peaks re-called using MACS. No additional analyses are performed.
  • lymphoma_14k_vertically_integrated_seurat_object.RDS: This Seurat object includes the optional second-round QC, cell cycle correction, normalization, and clustering for RNA and ATAC data. RNA and ATAC data are integrated using WNN.
  • lymphoma_14k_extended_seurat_object.RDS: The extended Seurat object contains the results of additional analyses based on the WNN integrated clusters, including intermediate and final results for all assays, in particular:
    • Seurat_object@assays$ATAC: The ATAC assay with peaks called by MACS.
    • Seurat_object@assays$SCT@misc$DEGs: The WNN cluster-specific differentially expressed genes (DEGs)
    • Seurat_object@assays$ATAC@misc$DARs: The WNN cluster-specific differentially accessible regions (DARs)
    • Seurat_object@assays$ATAC@misc$DARs_motif: The motif enrichment results for top cluster-specific DARs (p_adj < 0.005)
    • Seurat_object@assays$ATAC@links: The peaks linked to the top 5 WNN cluster-specific DEGs
    • scMultiome@misc$Combined_DEGs_GRN: tbl_graph object for combined DEGs
    • scMultiome@misc$Cluster_DEGs_GRN: tbl_graph objects for WNN cluster-specific DEGs

Tables

  • lymphoma_14k_extended_metaData.csv: A metadata table that includes QC metrics, cell cycle phases, clustering results for RNA, ATAC, and WNN integration, SingleR annotations, and CopyKAT tumor cell predictions for each cell.
  • lymphoma_14k*DEGs_GRN_Modules.csv: CSV files for the DEGs-based GRN modules.

RMD file

  • lymphoma_14k_QC_and_Primary_Results.Rmd: The Rmd file for generating this HTML report.